Linear-Space Substring Range Counting over Polylogarithmic Alphabets
Abstract
Bille and Gørtz (2011) recently introduced the problem of substring range counting, for which we are asked to store compactly a string S of n characters with integer labels in [0, u], such that later, given an interval [a, b] and a pattern P of length m, we can quickly count the occurrences of P whose first characters’ labels are in [a, b]. They showed how to store S in O(n log n / log log n) space and answer queries in O(m + log log u) time. We show that, if S is over an alphabet of size polylog(n), then we can achieve optimal linear space. Moreover, if u = n polylog(n), then we can also reduce the time to O(m). Our results give linear space and time bounds for position-restricted substring counting and the counting versions of indexing substrings with intervals, indexing substrings with gaps and aligned ...
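To make the query in the abstract concrete, the following is a minimal brute-force sketch of what a substring range counting query returns. It is not the data structure of the paper or of Bille and Gørtz; it scans the whole string in O(nm) time per query and only pins down the semantics. The function name `count_in_range` and the representation of labels as a parallel list are assumptions made for illustration.

```python
def count_in_range(S, labels, P, a, b):
    """Count occurrences of P in S whose first character's label lies in [a, b].

    Naive baseline (hypothetical helper, not the indexed structure from the
    abstract): labels[i] is the integer label attached to character S[i].
    """
    if len(S) != len(labels):
        raise ValueError("S and labels must have the same length")
    m = len(P)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    count = 0
    # Try every starting position i at which P could fit inside S.
    for i in range(len(S) - m + 1):
        # An occurrence counts only if the label of its first character
        # falls in the query interval [a, b].
        if a <= labels[i] <= b and S[i:i + m] == P:
            count += 1
    return count


# Position-restricted substring counting is the special case in which
# each character's label is simply its position in S.
S = "abracadabra"
positions = list(range(len(S)))
print(count_in_range(S, positions, "abra", 0, 10))
print(count_in_range(S, positions, "abra", 1, 10))
```

Using positions as labels, as in the usage lines above, shows why a bound for substring range counting carries over to position-restricted substring counting.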
Similar resources
On Geometric Range Searching, Approximate Counting and Depth Problems
In this thesis we deal with problems connected to range searching, which is one of the central areas of computational geometry. The dominant problems in this area are halfspace range searching, simplex range searching and orthogonal range searching and research into these problems has spanned decades. For many range searching problems, the best possible data structures cannot offer fast (i.e., ...
Forbidden substrings on weighted alphabets
In an influential 1981 paper, Guibas and Odlyzko constructed a generating function for the number of length n strings over a finite alphabet that avoid all members of a given set of forbidden substrings. Here we extend this result to the case in which the strings are weighted. This investigation was inspired by the problem of counting compositions of an integer n that avoid all compositions of ...
A Simple Parallel Cartesian Tree Algorithm and its Application to Suffix Tree Construction
We present a simple linear work and space, and polylogarithmic time parallel algorithm for generating multiway Cartesian trees. As a special case, the algorithm can be used to generate suffix trees from suffix arrays on arbitrary alphabets in the same bounds. In conjunction with parallel suffix array algorithms, such as the skew algorithm, this gives a rather simple linear work parallel algorit...
A new framework for addressing temporal range queries and some preliminary results
Given a set of n objects, each characterized by d attributes specified at m fixed time instances, we are interested in the problem of designing space-efficient indexing structures such that arbitrary temporal range search queries can be handled efficiently. When m = 1, our problem reduces to the d-dimensional orthogonal search problem. We establish efficient data structures to handle several classes of the gener...
Data Structures for Restricted Triangular Range Searching
We present data structures for triangular emptiness and reporting queries for a planar point set, where the query triangle contains the origin. The data structures use near-linear space and achieve polylogarithmic query times.
Journal title:
- CoRR
Volume: abs/1202.3208
Publication date: 2012